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Abstract 

We  consider  a  series  of  numerical  examples  and  compare  several  algorithms  for  estimation  of 
coefficients  in  differential  equation  models.  Unconstrained,  constrained  and  Tikhonov  regulariza¬ 
tion  methods  are  tested  for  the  behavior  with  regard  to  both  convergence  (of  approximation  meth¬ 
ods  for  the  states  and  parameters)  and  stability  (continuity  of  the  estimates  obtained  with  respect 
to  perturbations  in  the  data  or  observed  states). 
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Introduction 


In  [1],  [2]  Banks  et  al.  describe  an  algorithm  for  solving  a  two  dimensional  version  of  the 
parameter  identification  problem  for 
(  1  )  D(qDu)  =  f,  x  e  [0,1], 

where  q  is  an  unknown  to  be  chosen  from  some  given  parameter  set  Q,  f  is  assumed  known,  and 
observations  u  are  given  for  u  =  u(q).  Here  D  =  while  in  [1],  [2]  one  replaces  D  with  an 
appropriate  2-dimensional  gradient  operator.  The  algorithm  given  in  [1],  [2]  is  based  on  spline 
approximations  u^,  q^  e  for  both  the  states  u  and  unknown  parameters  q  and  a  least 
squares  criterion 


1 


A  convergence  theory  (as  the  dimensions  of  the  approximating  spline  spaces  increase  i.e.,  N  -  <x, 

M  -  oo)  is  given  in  [1]  where  one  may  use  either  linear  or  cubic  splines  for  the  state  approxima¬ 
tions  and  cubic  splines  for  the  parameter  approximations.  An  essential  feature  of  these  particular 
convergence  proofs  is  that  the  admissible  parameter  set  Q  and  its  approximations  lie  in  some 
compact  subset  of  C[0,1].  This  same  compactness  assumption  plays  a  fundamental  role  in  proving 
stability  (  e.g.,  continuity  of  the  inverse  of  the  mapping  from  the  parameter  estimates  to  the 
observations  or  data)  as  is  discussed  in  [3],  for  example.  As  we  shall  seek  to  demonstrate  in  this 
report,  it  so  happens  that  this  compactness  is  also  important  in  computational  aspects  of  the 
algorithms. 

Perhaps  the  most  direct  way  to  interpret  the  compactness  requirement  is  in  terms  of  con¬ 
straints  on  the  parameters.  For  example,  in  the  computations  reported  herein,  we  imposed  com¬ 
pactness  in  C[0,1]  by  putting  upper  and  lower  bounds  on  the  function  values  as  well  as  an  upper 
bound  on  the  absolute  values  of  the  slope  of  the  functions.  The  results  presented  in  this  paper 
illustrate  the  apparent  necessity  in  many  examples  of  including  such  a  compactness  constraint  in 
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computational  algorithms.  Examples  are  given  below  where  both  stability  and  convergence  prop¬ 
erties  are  as  expected  whenever  a  constrained  estimation  procedure  is  employed  whereas  instabil¬ 
ity  and  divergence  is  in  evidence  when  unconstrained  techniques  are  used.  It  is  safe  to  speculate 
that  similar  behavior  occurs  in  problems  with  parabolic  and  hyperbolic  systems  as  well  as  elliptic. 
In  our  own  work  and  in  that  reported  in  the  literature  -  e.g.  see  Yoon  and  Yeh  [7],  one  sometimes 
encounters  severe  problems  with  oscillations  in  the  estimates  for  q  as  one  pushes  the  algorithms 
for  increased  accuracy  in  the  parameter  estimates.  As  the  examples  in  this  report  demonstrate, 
these  difficulties  can  to  some  extent  be  alleviated  by  imposition  of  compactness  constraints. 

An  alternative  but  essentially  theoretically  equivalent  approach  involves  the  use  of  Tikhonov 
regularization  as  formulated  by  Kravaris  and  Seinfeld  in  [5],  One  restricts  the  parameter  set  to 
Qpj  =  Q  with  compactly  imbedded  in  Q  and  then  modifies  the  original  least  squares  criterion  J 
to  minimize  =  J  +  0  |  q|  ^  where  Q  -||  ^  is  the  norm  in  Qj^  and  0  is  a  regularization  parame¬ 
ter.  Thus  minimizing  sequences  for  are  bounded  in  Qp  and  hence  compact  in  Q;  this  is,  in  some 
sense,  roughly  equivalent  to  minimizing  J  over  a  restriction  of  Q  which  is  compact  ever,  though  the 
minimization  of  only  produces  (hopefully)  an  approximation  to  the  minimizer  for  the  original 

criterion  J.  In  the  cases  considered  below,  we  use  Qj^  =  H  while  Q  =  C  (which  corresponds  to 
1  2 

A  =  C  and  R  =H  in  the  notation  of  [5]). 

As  we  shall  see  below,  each  approach  has  inherent  difficu!  _s  in  choosing  related  imbedding 
parameters:  in  the  first,  the  estimates  produced  are  sensitive  to  the  constraints  (the  bound  L  on 
the  derivatives  of  the  parameters  in  the  computations  summarized  in  this  report)  while  the  esti¬ 
mates  produced  using  regularization  are  quite  sensitive  to  the  regularization  parameter  0 . 

In  our  calculations  we  have  attempted  to  compare  the  spline  based  algorithms  on  a  number  of 
examples  for  three  cases:  the  unconstrained  minimization  of  J;  the  constrained  minimization  of  J; 
and  unconstrained  minimization  of  a  regularized  criterion  J ^ . 

We  briefly  outline  the  algorithms  we  have  used,  deferring  some  of  the  details  to  Appendices. 


The  Estimation  Algorithms 


The  methods  we  illustrate  here  are  based  on  linear  spline  approximations  for  the  states  u  in 
(1)  as  well  as  for  the  parameter  q.  Full  details  of  the  approximation  schemes  are  given  elsewhere 
(e.g.  see  [11,  [2]  and  some  of  the  references  contained  therein).  We  consider  (1)  in  weak  form  and 
use  the  approximate  equations 


(  3  )  <qDu,D0j>  +  <f,0j>  =  0, 

where  0j  are  the  N-l  spline  basis  elements 

x  —  (i— 1)/N, 

(  4  )  0  j(x)  =  (i+  1)/N  -  x. 

0 


i=  1,  .  .  ,N-1 

xe  [  (i —  1  )/N,  i/N  ) 
xe[  i/N,  (i+l)/N  ) 
otherwise. 


The  aim  is  to  find  an  approximate  estimate  of  q, 


(  5  ) 


M  _  v  . 
q  -  I  p j'J'j 
j=i  J  J 


M+l 


where  the  are  the  restrictions  to  [0, 1] 

x  -  (j — 2)/M, 

(  6  )  0j(x)  =  j/M  -  x, 

0 


of  the  basis  elements 

xe[(j-2)/M,  ( j— 1)/M  ) 
xe[  (j-D/M,  j/M  ) 
otherwise. 


These  spline  basis  elements  are  not  normalised  to  have  a  maximum  value  of  one.  In  the 
unnormalised  form  the  slopes  are  plus  and  minus  one,  which  makes  it  very  easy  to  express  con¬ 
straints  on  the  slope  as  the  difference  between  the  coefficients  of  successive  basis  elements,  with¬ 
out  any  scaling  factors  involving  N  or  M. 


vwT>>i'vvv  vv .-v  r- 


Given  an  estimate  q^  and  substituting  it  into  (3)  we  can  solve  for  the  corresponding  u^, 


where 


(7)  u  =  I  ox<t>\ 

i=  1 

Thus  u*  is  a  function  of  q^,  i.e.  u^(q^)  (see  Appendix  B). 


The  estimation  algorithms  estimate  q  by  optimizing  over  q  V1  in  a  specified  subset  of  C[0.1] 


the  functional 


V>  =  J  luVWdx 


The  Unconstrained  Algorithm 


In  this  algorithm  the  functional  is  taken  to  be 


JN(qM)  =  j  |uN(qM)-0|  2dx 


This  functional  was  optimized  using  the  reduced  gradient  algorithm  (Appendix  A),  without  impos¬ 
ing  any  constraints. 


The  Constrained  Algorithm 


This  algorithm  is  similar  to  the  unconstrained  algorithm,  but  the  estimates,  q  ,  are  constrained  to 
a  compact  subset  of  C[0,1].  A  suitable  compact  subset  of  C[0,1]  is  the  set  of  bounded  functions 
with  bounded  derivative  almost  everywhere,  so  the  constraints  imposed  on  q^  were  |  Dq^|  <L, 
and  0.5Sq^S  10.0.  The  functional  (9)  was  minimized  using  the  reduced  gradient  algorithm 
(Appendix  A)  with  these  constraints. 


'  ‘  -V-‘  -  V  *  *-  .*•  ’•  .  v  ••  *.  s' ’•  •.  .%  /•  a/*,  -  .■»  *. a  . 


The  bound  on  the  absolute  value  of  the  derivative  is  the  parameter  L.  The  appropriate  value 


of  this  parameter  to  be  used  must  also  be  estimated,  either  from  apriori  knowledge  about  the 
parameter  q.  or  by  looking  at  the  behavior  of  the  estimates  as  the  constraints  are  eased,  i.e..  L  is 
increased. 

The  constraint  that  the  function  be  bounded  is  never  significant  in  the  work  presented  in  this 
paper;  however  as  we  shall  see,  the  results  are  quite  sensitive  to  the  derivative  bound  L. 


Tikhonov  Regularization 


Tikhonov  regularization  is  one  method  which  has  been  proposed  [5]  to  prevent  the  oscillations  in 
the  estimates.  The  estimation  problem  is  changed  by  optimizing  over  a  different  functional 
J^(q^),  where  ; 

(10  1  J?(qM)  =  JN(qM)  +  /3lqMfia 

and 


(  11  ) 


(  12  ) 


|  uN(qMl-u  |  2  dx 
1 

[  q^  |  2  dx  +  ("  |  Dq^  |  2  dx 
^  0 


This  functional  was  mimimized  using  the  reduced  gradient  algorithm  (Appendix  A)  without 
imposing  any  constraints. 

As  we  have  indicated,  the  addition  of  p  5  q^  j  ^  essentially  constrains  the  estimates  to  a  com¬ 
pact  subset  of  the  estimation  space,  but  it  also  creates  some  bias  in  the  estimate.  In  the  limit  as 
N,  M  -»  ce  one  cannot  expect  to  converge  to  the  true  parameter  unless  (3  -  0  as  N  and  M  -  ® 


'Data”  For  The  Algorithm 


Throughout  this  work  we  assume  that  we  know  the  functions  u,q  and  f  as  continuous  func¬ 
tions  of  x.  That  is,  in  the  test  examples,  we  are  given  (independent  of  the  approximation  indices  N 
and  M)  values  for  x-u(x)  (though  possibly  with  some  error),  not  just  a  finite  dimensional  approxi¬ 
mation  to  it.  Thus  the  least  squares  functional  (9)  is  evaluated  using  an  infinite  dimensional  value 
for  u.  We  do  not  look  at  the  effect  of  knowing  only  a  finite  dimensional  approximation  to  u,  as 
would  be  the  case  if  u  were  observed  only  for  some  values  of  x  and  then  an  interpolated  function 
were  used  for  the  observations  u. 


The  Error  Measure:  L,2  vs  L® 

In  some  of  the  results  presented  in  this  paper  we  have  looked  at  the  L®  norm  of  the  estimate 

2 

error.  We  could  also  have  used  the  L  norm  of  the  estimate  error. 

It  is  possible  to  have  convergence  of  an  estimate  in  the  L2  norm  but  not  in  the  L®  norm  but 

2 

not  visa  versa,  of  course.  However  in  the  examples  studied  here  the  L  norm  of  the  error  has 
behaved  similarly  to  the  L®  norm.  The  oscillations  which  have  developed  have  not  tended  to  occur 
over  a  shorter  length  as  their  height  increases,  thus  the  L®  and  L2  norms  of  the  error  have 
increased  together. 

2 

On  those  figures  which  show  the  changes  in  estimate  error  with  iteration  number  both  the  L 
and  L®  norms  of  the  estimate  errors  are  given. 


Examples 


The  following  examples  were  used  in  testing  the  algorithms.  For  examples  2  through  5,  f 
tends  to  infinity  near  x  =  0,  and  thus  the  calculation  (see  Appendix  C)  of  F1  (=  <f,$  1  >)  using 
Simpson’s  rule  cannot  be  expected  to  be  accurate.  Therefore,  for  these  examples  Ft  was  evaluated 
using  its  analytic  expression. 

Example  1 

(  13  ) 

(  14  ) 


q  =  2  +  x  xe  [0,1/3] 

8/3-x  xe  [1/3,  1] 

u  =  sin(irx) 


Example  2 


2  +  x, 

(  15  )  q  =  8/3-x. 

4/3 +  x, 

(16)  u=/x(l-/x) 


Example  3 


2  +  2*x, 

(  17  )  q  =  10/3  -  2*x, 

2/3  +  2*x, 
(18)  u=/x(l-/x) 


xe  [0,1/3) 
xe  [1/3, 2/3) 
xe  [2/3,1] 


xe  [0,1/3) 
xe  [1/3, 2/3) 
xe [2/3,1] 


•a  •- 


Example  4 


(19)  q  =  2,  xe  [0,1] 

(20)  u=/x(l-/x) 

Example  5 

2  +  x, 

(21)  q  =  -3*x2 +3*x  +  5/3, 

4/3 +x, 

(22)  u  =  /x(l-/x) 

M 

The  initial  guess  for  q 

M 

The  initial  guess  for  q(x)  in  all  Examples  was  q  0  (x)=  1 


xe  [0,1/3) 
xe [1/3, 2/3) 
xe [2/3,1] 
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The  Constrained  versus  Unconstrained  Algorithms 

In  general  constraints  are  necessary  for  the  estimation  algorithm  to  work.  We  first  compare 
the  two  algorithms  taking  the  constraint  parameter  L  to  be  one,  i.e.  we  require  that  the  absolute 
value  of  the  slope  of  the  estimates  be  less  than  or  equal  to  one. 

For  Example  1  both  algorithms  perform  similarly  for  N  and  M  of  comparable  magnitude,  but 
as  M  is  increased  only  the  estimates  given  by  the  constrained  algorithm  are  accurate.  The  esti¬ 
mate  given  by  the  unconstrained  algorithm  develops  oscillations  (Fig  1  -  Fig  4). 

These  oscillations  are  reduced  as  N  is  increased,  but  for  the  algorithm  to  perform  satisfacto¬ 
rily  at  low  N  (a  case  of  importance  in  actual  applications  of  the  method  to  more  complex  problems  ) 
the  constraints  are  essential. 

The  situation  is  much  more  dramatic  with  u  taken  as  / x(l-/ x),  where  u  is  a  much  more 
sharply  curving  function  near  x  =  0  and  hence  much  harder  to  approximate  with  u  when  N  is 
small.  For  N  =  8  the  constrained  estimate  shows  little  detail  but  does  give  an  idea  of  the  mean 
value  of  the  true  estimate  (Fig  5).  The  unconstrained  estimate  is  completely  wild  and  gives  no 
useful  information  (Fig  6). 

That  the  unconstrained  estimate  is  so  bad  is  not  the  result  of  ill  choosen  convergence  criteria. 
The  best  estimate,  found  using  the  knowledge  of  the  true  q  is  shown  in  Fig  7.  The  estimate  using 
much  weaker  convergence  criteria  is  shown  in  Fig  8.  Neither  give  much  information  about  the 
true  q. 

Figures  9-11  show  the  behavior  of  the  unconstrained  estimate  as  N  is  increased.  For  N  =  96 
(Fig  11)  the  unconstrained  estimate  is  good  with  only  small  oscillations  at  each  end,  and  shows  a 
reasonably  steady  improvement  as  J  decreases,  i.e.  is  insensitive  to  the  convergence  criteria  used, 
compared  to  the  estimate  for  N=16  M  =  16  (Fig  9)  which  deteriorates  as  you  push  the  cost  (J)  to 
lower  values  (see  the  graphs  on  Figures  9  -  11  of  L*  and  L,2  vs  iterations). 


On  the  other  hand  the  constrained  algorithm  behaves  well  for  all  the  values  of  N  (Fig  12  - 
Fig  14)  and  is  insensitive  to  when  you  stop  the  iterations  even  for  the  N  =  8  M=  15  case.  (Fig  5).  It 
also  provides  consistently  better  estimates  with  far  fewer  iterations  even  when  N  =  96,  although 
the  number  of  iterations  involved  in  finding  the  unconstrained  estimate  could  be  reduced  consider¬ 
ably  by  using  a  better  unconstrained  optimization  algorithm  than  the  reduced  gradient  method. 
When  used  with  no  constraints  the  reduced  gradient  method  is  just  a  gradient  method  in  a  space 
isomorphic  to  the  estimate  space  and  will  not,  in  general,  perform  any  better  than  the  normal 
gradient  method. 


103982D-02  L-INFINITY  N0RM  0*-Q  =  0.784441D-01 


Fiq  4:  Unconstrained  Es 


ITERATION  999  J  =  CU71505D 

L2  N0RM  Q-Q  =  0.427I38D-02  L-INFINITY  N0F 


Fiq  6:  Unconstrained  Estimation  «2:  N=3  M-15 


J  =  0.3O500GD-03 
823974D-01  L-INFINITY  FORM  O' 


SNBiivaai 


L-INFINITY  N0RM  Q--Q  =  0„670483D*01 


[TERATI0N  9999  J  =  0.100169D-03 

L2  N0RM  0*-Q  =  0.714982D*00  L-INFINITY  N0RM  0*-Q  =  0.2796900*01 


me 


[TERATI0N  9999  J  =  0.541171D-05 

L2  N0RM  O'-Q  =  0.104115D-01  L-INFINITY  N0RM  Q--Q  =  0.732284D-00 


ITERATION  3922  J  =  0.69509 

L2  NORM  Q«-Q  =  0.606869D-03  L-INFINITY  Nl 


[TERATI0N 
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Finding  an  Appropriate  Value  for  L 


In  the  previous  section  we  choose  the  constraint  parameter  L  to  be  one,  which  was  the  slope 
of  the  true  parameter.  In  general  the  maximum  slope  of  the  true  parameter  will  not  be  known 
exactly,  but  must  be  estimated.  Figure  15  shows  the  behavior  of  the  estimate,  q^,  for  Example  2 
as  L  is  increased.  As  the  constraints  are  eased,  the  estimate  approaches  the  true  parameter 
(graphed  with  a  dashed  line  here  and  in  subsequent  figures),  then  as  the  constraints  are  eased 
further  the  estimates  develop  the  oscillations  characteristic  of  the  unconstrained  algorithm.  From 
the  graph  the  best  value  of  L  can  be  seen,  provided  you  have  some  apriori  knowledge  of  the  oscil¬ 
lations  which  are  present  in  the  true  parameter. 

These  examples  are  slightly  artificial  in  that  the  slope  of  q(x)  is  always  one,  thus  there  is  a 
value  of  L  which  is  exactly  the  true  magnitude  of  the  slope  everywhere.  This  is  not  the  case  for 
Example  5  which  starts  and  finishes  with  slope  one,  but  is  parabolic  for  the  middle  third.  Figure 
16  shows  the  change  in  estimates  for  this  example  as  L  is  increased,  while  Figure  17  shows  the 
change  in  estimates  as  M  is  increased.  As  can  be  seen  the  behavior  is  similar  even  though  the 


constraints  are  not  exact  everywhere. 
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Finding  Appropriate  Values  for  a  and  0 

Some  discusion  of  the  problem  of  finding  a  suitable  regularization  parameter  is  given  in  [5 
We  found  a  suitable  0  by  trying  a  wide  range  of  values.  Figure  18  shows  the  estimates  for  vari¬ 
ous  values  of  0  •  It  can  be  seen  that  as  /3  is  decreased  the  estimates  converge  towards  the  true 
answer  and  then  start  to  develop  the  oscillations  typical  of  the  unconstrained  method.  From  the 
graph  the  best  value  of  /3  can  be  seen,  without  knowing  the  true  parameter,  provided  some 
assumptions  are  made  about  the  sort  of  oscillations  that  are  present  in  the  true  parameter. 

Figure  19  shows  the  L°°  norm  of  the  error  versus  (}  for  several  values  of  a.  Note  that  the 
estimate  is  much  less  sensitive  to  the  value  of  p  as  a  is  decreased.  Reducing  a  has  the  effect  of 
limiting  the  slope  of  the  estimate  more  than  the  absolute  value  of  the  function. 

For  the  work  in  this  paper,  comparing  the  various  algorithms,  we  used  our  knowledge  of  the 


true  q  to  choose  a  suitable  value  of  @ . 


0.6  0.8 


The  Effect  of  ChanqLnq  Alpha 


The  Tikhonov  versus  Constrained  Estimation  Algorithms 


For  low  N  the  behavior  of  the  Tikhonov  estimation  algorithm  is  similar  to  that  of  the  con¬ 
strained  algorithm.  For  Example  2  both  are  stable  as  M  is  increased  (Fig  20)  and  give  similar 
estimates,  provided  0,  a  and  L  are  suitable  values.  This  is  in  contrast  to  the  unconstrained  esti¬ 
mates,  which  tend  to  develop  oscillations  as  M  is  increased.  However  with  low  N  both  algorithms 
give  an  estimate  with  no  real  detail  (Figs  21  and  22). 

As  N  is  increased  while  holding  M  fixed  the  estimates  for  all  three  algorithms  improve  (Fig 
23),  however  the  constrained  algorithm  performs  best,  while  the  Tikhonov  algorithm  tends  to  pro¬ 
duce  a  flattened  estimate  (Fig  24). 

The  bias  in  the  Tikhonov  method,  evidenced  by  an  estimate  which  is  somewhat  flatter  than 
the  true  parameter,  is  more  marked  the  more  strongly  varying  q^  is.  Figure  25  shows  the 
improvement  in  the  estimate  as  N  is  increased  for  Example  3.  The  constrained  algorithm  performs 
much  better  than  the  Tikhonov  algorithm  which  produces  a  badly  biased  estimate,  even  for  the 
best  (},  chosen  knowing  the  true  parameter.  When  j3  is  reduced,  to  reduce  the  bias,  oscillations 
develop  in  the  estimate  before  it  gets  near  to  the  true  parameter  (Fig  26). 

The  constrained  estimation  algorithm  does  not  have  this  problem.  As  L  is  increased  the  esti¬ 
mates  come  close  to  the  true  parameter  and  only  then  develop  oscillations  (Fig  27). 

For  a  very  flat  function  the  Tikhonov  method  works  well.  The  parameter  q  in  Example  4  is 
just  a  constant  function.  As  0  is  decreased,  with  sufficiently  large  N,  the  Tikhonov  estimates  con¬ 
verge  to  the  true  parameter  and  only  then  start  to  develop  oscillations  as  0  is  decreased  further 
(Fig  28). 

For  this  flat  function  the  constrained  algorithm  also  works  well,  if  given  much  tighter  con¬ 
straints  than  in  the  previous  examples.  The  estimates  for  various  values  of  L  are  showm  in  Figure 
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Constrained  Estimates  oP  Q  a4i  N  =  64 


The  Stability  of  the  Estimates  With  Perturbations  of  U 


To  test  the  stability  of  our  algorithms  with  respect  to  noise  in  the  data,  we  took  the  true  u,q 
and  f  in  Example  2,  but  perturbed  u  to  produce  the  data  u  =  Up  in  the  cost  criterion  J.  We  used  a 
perturbation  of  the  form 

Up(x)  =  u(x)  +  p(x)/K 

where  p(x)  is  the  perturbing  function,  and  K  determines  the  size  of  the  perturbation.  As  K  is 
increased  Up(x)  -  u(x).  We  used  two  perturbation  functions: 

(a)  p1(x)  =  x(l-x) 

(b)  P2(x)=  1 

Note  that  pj(x)  satisfies  the  same  boundary  conditions  as  u  while  P2<x)  does  not. 

The  unconstrained,  constrained  and  Tikhonov  estimates  for  several  values  of  K  with  pertur¬ 
bation  function  pj(x)  and  N  =  8  and  M=  15  are  shown  in  Figures  30  -  32.  With  the  constrained 
and  Tikhonov  algorithms  the  estimates  improve  steadily  as  the  error  in  the  input  data,  u,  is 
reduced,  while  the  unconstrained  estimates  are  bad  even  with  the  exact  data.  The  L°°  norm  of  the 
errors  in  the  Final  estimates  versus  K  for  N  =  8  M=15,  N=16  M=16  and  N  =  64  M=16  for  both 
perturbations  are  shown  in  Figures  33  through  38.  Figure  33  also  shows  the  results  for  the 
unconstrained  algorithm  when  weaker  convergence  criteria  are  used. 

For  all  the  values  of  N  the  behavior  of  the  constrained  and  Tikhonov  methods  are  similar, 
with  the  estimate  improving  steadily  as  Up(x)  -  u(x),  the  true  solution. 

For  small  N,  the  unconstrained  estimates  actually  get  worse  as  Up  approaches  u(x).  and  for 
all  perturbations  these  estimates  are  worse  than  the  initial  guess,  which  has  an  error  in  the  L* 
norm  of  4/3.  For  large  N  the  unconstrained  estimates  are  stable  with  respect  to  perturbations  of 
the  input  data. 


•  t*m'  tk-  MU  i  k  ‘|La| 


By  comparing  Figures  35  and  38  it  can  be  seen  that  perturbation  2,  which  does  not  satisfy 
the  same  boundary  conditions  as  u.  has  a  much  greater  effect  on  the  estimate  error. 
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The  Convergence  Criteria 

In  finding  the  estimates  the  reduced  gradient  algorithm  was  run  until  one  of  the  following 
conditions  was  met: 

(a)  the  sequences  and  Jj  both  converged,  where  Xj  was  said  to  have  converged  if : 

lxrxi-5l/lxil  s  10'5  =  e 

or  (ii)  Ixj-xj.-l  <  10'14 

where  ||  xj  is  the  L2  norm  squared  if  x  is  q^  and  is  just  J(q^)  if  x  is  J(q^). 

(b)  the  number  of  iterations  exceeeded  999  (or  in  some  cases  9999) 

In  several  of  the  examples,  especially  unconstrained  examples,  the  best  estimate  does  not 
occur  when  J  is  minimized.  The  convergence  criteria  used  in  the  work  for  this  paper  were  very 
tight,  and  it  is  possible  that  in  using  such  tight  criteria  the  optimization  of  J  was  taken  too  far. 
Thus  the  oscillations  seen  in  the  above  examples  could  be  a  manifestation  of  numerical  inaccura¬ 
cies  which  became  significant  when  reductions  in  the  cost  J  at  each  stage  of  the  optimization 
algorithm  were  comparable  to  the  error  in  the  evaluation  of  J. 

To  check  this  effect  several  of  the  examples  where  the  estimates  do  deteriorate  significantly 
as  J  is  reduced  were  rerun  with  the  following  weaker  convergence  criteria. 

The  reduced  gradient  algorithm  was  run  until  one  of  the  following  conditions  was  met: 

(a)  the  sequences  q^  and  Jj  both  converged,  where  Xj  was  said  to  have  converged  if : 

(»)  l*rxi-5l/lxil  s  10'2  =  e 

or  (ii)  |xrxj.5|  £  10‘8 
where  |  |  is  as  described  above. 

(b)  the  number  of  iterations  exceeeded  999 


5' 


Figures  39  through  41  show  the  results.  In  general  the  estimates  are  slightly  better  but  still 
show  the  same  defects,  just  not  as  highly  developed.  Even  using  the  knowledge  of  the  true  param¬ 
eter  to  Find  the  best  estimate  (best  meaning  the  estimate  in  the  sequence  wrh  the  smallest  error 
in  the  L06  norm)  does  not  produce  a  good  estimate,  just  one  with  the  same  faults  at  an  earlier 
stage.  We  can  therefore  conclude  that  the  convergence  criteria  used  here,  which  are  stricter  than 
would  be  used  if  computing  time  were  significant,  are  not  producing  results  with  errors  due  to 
numerical  inaccuracies.  Rather  the  deficiencies  in  the  estimates  are  true  manifestations  of  algor¬ 
ithm  weaknesses. 


Conclusions 


The  work  in  this  paper  demonstrates  severe  problems  in  some  instances  with  using  an 
unconstrained  algorithm  to  estimate  the  parameter  q.  When  modified,  either  by  regularizing  the 
problem  using  Tikhonov  regularization  or  by  constraining  the  estimate  set  as  in  this  papgfT  the 
algorithm  does  give  good  estimates. 

Unlike  the  unconstrained  algorithm,  both  the  Tikhonov  and  constrained  algorithms  are  stable 
with  respect  to  increasing  M  while  holding  N  fixed.  However  as  N  is  increased  the  estimates  from 
the  Tikhonov  algorithm  do  not  improve  as  much  as  do  those  of  the  constrained  algorithm.  The 
Tikhonov  estimates  are  biased  by  the  regularization  of  the  cost  functional,  and  never  show  all  the 
detail  of  q  when  q  has  significant  variation. 

Both  the  constrained  and  Tikhonov  estimation  algorithms  are  stable  with  respect  to  system¬ 
atic  errors  in  the  input  data,  while,  except  when  N  is  large,  the  unconstrained  algorithm  fails  to 
give  good  results  on  even  the  exact  data. 

For  both  the  Tikhonov  and  constrained  algorithms  there  are  parameters  which  affect  the 

algorithm’s  performance.  For  the  constrained  algorithm  suitable  constraints  must  be  found  while 

a  k 

for  the  Tikhonov  algorithm  suitable  values  of  &  and  £  must  be  found.  The  constrained  algorithm 
has  the  advantage  that  the  constraints  used  here,  i.e.  limits  on  the  slope  of  q,  have  an  obvious 
meaning,  and  so  may  well  be  known  in  advance.  In  the  Tikhonov  algorithm  (}  and  a  have  no 
obvious  meaning.  They  must  be  suggested  by  looking  at  the  change  in  the  estimate  behavior  as  0 
and  a  change,  and  perhaps  using  some  apriori  knowledge  about  the  shape  of  q  to  choose  values  of 
(l  and  a  that  give  an  estimate  that  is  neither  too  fiat,  nor  too  oscillatory. 
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Appendix  A  The  Reduced  Gradient  Optimization  Algorithm 


For  each  M,  q  V1  is  represented  as  a  linear  combination  of  linear  splines.  The  set  of  such 


functions  is  a  subspace  of  CfO.l]  which  is  isomorphic  to  R  The  bounded  derivative  condition 


in  C[0.1]  becomes  a  condition  on  the  difference  between  successive  coefficients  in  R  ^  +  \  i.e. 


|  pJ  +  j-pj|  must  be  buunded.  The  condition  that  the  function  be  bounded  becomes  a  condition  on 
the  values  of  the  individual  coefficients,  i.e.  j  pj|  must  be  bounded. 


Thus  the  minimization  is  over  a  subset  of  R  1  ,  this  subset  being  defined  by  a  set  of  con¬ 


straint  equations.  To  find  the  minimum  subject  to  these  constraints  the  reduced  gradient  algorithm 


[6]  was  used. 


In  the  reduced  gradient  algorithm  the  coefficients  of  q  1  are  transformed  by  a  linear  transfor¬ 


mation  to  give  coefficients  with  respect  to  a  new  (in  general,  nonorthogonal)  coordinate  system  in 


which  the  constraints  are  rectangular.  The  minimum  is  then  found  by  a  modified  gradient  algor¬ 


ithm.  At  each  iteration  there  is  a  linear  search,  the  direction  of  w’hich  is  given  by  the  gradient  of 


J(qM),  with  respect  to  the  new  coordinate  system,  modified  by  setting  to  zero  any  components 


which  would  cause  the  linear  search  to  violate  a  constraint. 


The  reduced  gradient  algorithm  was  also  used  when  there  were  no  constraints,  so  the  results 


could  be  compared  with  the  results  for  the  constrained  case.  Although  there  were  no  constraints 


the  coefficients  were  still  transformed  using  the  same  linear  transformation  as  for  the  constrained 


case.  However  in  this  case  the  linear  searches  were  in  the  direction  of  the  unmodified  gradient 


vector,  there  being  no  constraints  to  violate. 


M  M 


In  both  the  constrained  and  unconstrained  cases  a  diagonal  search  direction,  i.e.  q  ■  — q  j  _  ^ » 


the  direction  given  by  the  difference  between  the  two  previous  estimates  of  q,  was  used  whenever 


(a)  the  dot  product  of  the  two  previous  improvement  vectors  was  negative,  i.e. 


/M  M  ,  /  M  M  wn 

(q  i  -<i  i  .i>  (q  j  ,i~q  i  -2)<0- 


and  (b)  neither  q^  nor  q^.j  were  the  result  of  a  diagonal  step. 


I 


3 

3 


a 
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Appendix  B  Finding  u  as  a  function  of  q 


In  this  section  the  tensor  summation  notation  is  used,  i.e.  when  an  index  appears  as  both  a 
subscript  and  a  superscript  in  a  product  term  it  represents  the  sum  of  the  product  over  the 
repeated  index. 


Substituting  (5)  and  (7)  into  (3)  we  obtain 


M+l  N-l 

(23)  <  I  Pji/zj  I  o;  D0j,  D0k  >  +  <f,0k> 

j=l  i=l 


M+l  N-l 

I  1  pja D0k  >  +  Fk  =  0, 


k=  1,  .  .  ,N-1 


k=l,  .  .  ,N-1 


j=l  i=l 

which  is  equivalent  to 


PjMkJ°i  +  Fk  -  0 


where 


Fk  "  <f>«k> 

Mkj  =  <ifij  D0j.  0R> 

1 

=  J  tfj  D<t>  i<t>kdx 


r  p/N 

^jdx  = 

J(p-1)/N 

r(i+l)/N 


i  =  k+  l  =  p  or  k  =  i+  1— p 


ip jdx  =  1}  +  I.  ,  ,, 

J(i-1)/N  J  1+1 


otherwise. 


Thus  are  symmetric  positive  definite  tridiagonal  matrices  for  each  j.  So  Ak  =  PjMk"*  is 
also  a  symmetric  tridiagonal  matrix,  and  is  positive  definite  if  pj  >  0  for  all  j. 


has  a  unique  solution. 


The  matrix  A^,  being  an  n  x  n  symmetric  and  tridiagonal  matrix,  has  an 
LDL^-decomposition  where  L  is  lower  triangular  with  one’s  on  the  diagonal  and  one  non-zero 


lower  diagonal  and  D  is  diagonal  ([4]  with  DL  —  U). 


i=j,  l<i<n 


(  27  ) 

Hi 

-  Mp 

j  =  i—  1 ,  2<i<n 

=  0, 

otherwise. 

(  28  ) 

Dij 

=  7  i> 

i=j,  lsifin 

=  0, 

otherwise. 

Because  of  the  form  of  M^J,  AjJ.  has  the  following  special  form 

=  Xi  +  Xj+1, 

i  =  k 

(29) 

Al 

=  -  xi> 

i=k+l 

=  -  xk- 

i  =  k—  1 

where 

(30) 

Xi 

=  pfi 

With  this  form  7- 

and  are  given  by  : 

71 

=  Xj+X2 

(  31  ) 

7  i 

=  C(i)  /  C(i-l) 

2si^n 

1—i 

* 

>< 

1 

II 

2si£n 

where  C(r)  is  the  sum  of  the  r  +  1  products  of  possible  combinations  of  r  ij ’s  out  of  the  first  r  +  1 

ij’s  (i=  1  .  .  r+ 1). 

e.g.  C(2)  =  Illj  +  lilj+ljlj 


--  j-  p r_ 
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Proof 

Substituting  the  elements  of  into  the  equations  for  the  LU  decomposition  of  a  general  tri¬ 
diagonal  matrix  [4], 
the  values  of  7  ■  and  u  j  are  given  by  : 

(32)  7l  =  Xj+X2 

(33)  7j  =  xi  +  xi+ 1  -  Xi2/7i-l  i  =  2..n 

(  34  )  =  X^i.j 

Assume  (31)  is  true  for  some  i,  then  using  (33) 

(35)  7i+1  =  (Xi+1+Xi  +  2)-(Xi+1)  2  C(i-1)/C(i) 

=  C(i+1)/C(i) 

so  (31)  is  also  true  for  i+  1.  (31)  is  true  for  i  =  2  so  by  induction  it  is  true  for  2<i<n. 

The  value  of  C(i)  is  easily  calculated  from  C(i-l)  so  the  calculation  of  7  j  and  p-  and  hence  of 
the  LDL1-  decomposition  can  be  done  very  quickly  and  easily.  However  C(i)  grows  (or  declines) 
exponentially  with  i.  In  a  numerical  implementation  this  problem  can  be  solved  by  calculating 
C(i)/jLt 1  where  y  is  the  geometric  mean  of  the  Ij.  In  the  programs  used  here  u  was  approximated 
by  taking  the  geometric  mean  of  Ij,  Iint(N/2)’  anc*  Ij«q- 

t  i  N 

Given  the  LDL  -decomposition  of  A^  equation  (26)  can  be  solved  to  give  u  as  a  function  of 

M 

q  by  simple  back  substitution. 
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Appendix  C  Implementation  Details 


The  evaluation  of  integrals  by  Simpson’s  Rule 


The  quantities  J(q^),  F-  =  < f,0 •  >  and  |q^|  2  were  evaluated  using  Simpson's  rule  with 
Nl  subdivisions  of  the  interval  [0,1]  for  the  first  two  and  N2  subdivsions  for  the  last  one.  The 
quantity  jq  was  evaluated  by  sampling  the  difference  at  the  same  N2  points  used  in  the 
Simpson’s  rule  evaluation  of  the  L2  norm. 

Nl  and  N2  were  always  at  least  twice  N  and  M  respectively.  Only  the  value  of  Nl  affects 
the  algorithm.  The  accuracy  of  the  integration  will  be  most  sensitve  to  changes  in  N 1  when  one  of 
N  or  M  is  not  a  multiple  of  the  other.  In  this  case  the  product  f  ■  0  •  has  a  discontinuous  derivative 
at  the  points  i/N  and  j/M,  and  the  j/M  points  are  not,  in  general,  included  in  the  sum  for  the 
Simpson’s  rule  evaluation  of  the  integral  of  the  product  f  •  0j. 

The  value  of  N2  does  not  affect  the  convergence  of  the  algorithm,  it  affects  only  the  evalua¬ 
tion  of  the  L2  and  norms,  which  were  used  for  informational  purposes  and  for  determining 
convergence  but  not  in  the  calculation  of  the  successive  steps  of  the  optimization. 

The  implementation  of  the  algorithms 

The  algorithms  were  implemented  in  FORTRAN  on  an  IBM3081.  Double  precision  arithmetic 
was  used  throughout. 
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